Machine Translation Divergences: A Formal Description and Proposed Solution

نویسنده

  • Bonnie J. Dorr
چکیده

There are many cases in which the natural translation of one language into another results in a very different form than that of the original. The existence of translation divergences (i.e., crosslinguistic distinctions) makes the straightforward transfer from source structures into target structures impractical. Many existing translation systems have mechanisms for handling divergent structures but do not provide a general procedure that takes advantage of the systematic relation between lexical-semantic structure and syntactic structure. This paper demonstrates that a systematic solution to the divergence problem can be derived from the formalization of two types of information: (1) the linguistically grounded classes upon which lexical-semantic divergences are based; and (2) the techniques by which lexical-semantic divergences are resolved. This formalization is advantageous in that it facilitates the design and implementation of the system, allows one to make an evaluation of the status of the system, and provides a basis for proving certain important properties about the system.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Machine Translation Divergences: A Formal Description and Proposed Solution

There are many cases in which the natural translation of one language into another results in a very different form than that of the original. The existence of translation divergences (i.e., crosslinguistic distinctions) makes the straightforward transfer from source structures into target structures impractical. Many existing translation systems have mechanisms for handling divergent structure...

متن کامل

Machine Translation Divergences : A Formal Description

There are many cases in which the natural translation of one language into another results in a very diierent form than that of the original. The existence of translation divergences (i.e., cross-linguistic distinctions) makes the straightforward transfer from source structures into target structures impractical. Many existing translation systems have mechanisms for handling divergent structure...

متن کامل

Tree Transducers, Machine Translation, and Cross-Language Divergences

Tree transducers are formal automata that transform trees into other trees. Many varieties of tree transducers have been explored in the automata theory literature, and more recently, in the machine translation literature. In this paper I review T and xT transducers, situate them among related formalisms, and show how they can be used to implement rules for machine translation systems that cove...

متن کامل

Efficient Inference through Cascades of Weighted Tree Transducers

Weighted tree transducers have been proposed as useful formal models for representing syntactic natural language processing applications, but there has been little description of inference algorithms for these automata beyond formal foundations. We give a detailed description of algorithms for application of cascades of weighted tree transducers to weighted tree acceptors, connecting formal the...

متن کامل

A new model for persian multi-part words edition based on statistical machine translation

Multi-part words in English language are hyphenated and hyphen is used to separate different parts. Persian language consists of multi-part words as well. Based on Persian morphology, half-space character is needed to separate parts of multi-part words where in many cases people incorrectly use space character instead of half-space character. This common incorrectly use of space leads to some s...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Computational Linguistics

دوره 20  شماره 

صفحات  -

تاریخ انتشار 1994